Cross-Dialectal Data Transferring for Gaussian Mixture Model Training in Arabic Speech Recognition

نویسندگان

  • Po-Sen Huang
  • Mark Hasegawa-Johnson
چکیده

Dialectal Arabic speech recognition is a difficult problem and is relatively less studied. In this paper, we propose a cross-dialectal Gaussian mixture model training criteria to transfer knowledge from one domain to the other by data sharing. Specifically, phone classification experiments on West Point Modern Standard Arabic Speech corpus and Babylon Levantine Arabic Speech corpus demonstrate that the cross-dialectal training improves phone classification accuracy significantly, especially when a small amount of MSA data is transferred. Keywords—Transfer learning, Arabic automatic speech recognition, Gaussian Mixture Models, dialectal Arabic

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

متن کامل

State-Dependent Phoneme-Based Model Merging for Dialectal Chinese Speech Recognition

Aiming at building a dialectal Chinese speech recognizer from a standard Chinese speech recognizer with a small amount of dialectal Chinese speech, a novel, simple but effective acoustic modeling method, named statedependent phoneme-based model merging (SDPBMM) method, is proposed and evaluated, where a tied-state of standard triphone(s) will be merged with a state of the dialectal monophone th...

متن کامل

مقایسه روش های طیفی برای شناسایی زبان گفتاری

Identifying spoken language automatically is to identify a language from the speech signal. Language identification systems can be divided into two categories, spectral-based methods and phonetic-based methods. In the former, short-time characteristics of speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...

متن کامل

Multilingual speech recognition: the 1996 byblos callhome system

This paper describes the 1996 Byblos Callhome speech recognition system for Spanish and Egyptian Colloquial Arabic. The system uses a combination of Phoneticly Tied-Mixture Gaussian HMMs and State-Clustered Tied-Mixture Gaussian HMMs in a multiple pass decoder. We focus here on the aspects of the system which are language specific and demonstrate the adaptability of the Byblos English system to...

متن کامل

Automatic Diacritization Of Arabic For Acoustic Modeling In Speech Recognition

Automatic recognition of Arabic dialectal speech is a challenging task because Arabic dialects are essentially spoken varieties. Only few dialectal resources are available to date; moreover, most available acoustic data collections are transcribed without diacritics. Such a transcription omits essential pronunciation information about a word, such as short vowels. In this paper we investigate v...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012